-
Notifications
You must be signed in to change notification settings - Fork 9.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix tx buffer inconsistency if there are unordered key writes in one tx. #17263
Conversation
/retest |
188163b
to
ea46154
Compare
Hi @siyuanfoundation you can try to limit cpu by the following command and use patch #17248
Hope it can help. |
ea46154
to
a6132e5
Compare
Thank you @fuweid, it did help! |
9816f82
to
f6a5621
Compare
1abf2f9
to
440a140
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
440a140
to
9b4acde
Compare
Please read #17247 (comment) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
9b4acde
to
9d4d3af
Compare
/retest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
The change looks good to me, it fixes a clear problem and adds regression tests, however #17228 did the same. Can we take some additional steps to ensure we prevent another regression? I don't we can depend on reviewers trying to analyse and find edge cases as it failed before. One idea I have is to do comparison testing between buffered and unbuffered transaction. What do you think? |
1cd20f4
to
003a32a
Compare
cc @ahrtr
After the PR:
|
34de0c8
to
1667df0
Compare
cc @serathius |
Can you use benchstat ? https://sourcegraph.com/blog/go/gophercon-2019-optimizing-go-code-without-a-blindfold |
Ok. I ran
Here are the results: (they are basically the same)
After
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
Thanks
Benchstat do not change the results, it just makes them readable and makes it easy to check if they are statistically significant If you provide both old and new benchmark results to benchstat, it will provide also the performance delta and p-value |
} | ||
if !isSeq { | ||
shuffleList(ks) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: When benchmark needs some input data, it's best to avoid not include it in the measurements of the benchmark to avoid noise.
You can do that by preparing the input data first, and then calling b.ResetTimer()
before running the code you want to benchmark.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's great to have a regression tests, however I'm thinking that the scenarios we are covering are so intricate and complicated that it scares me. If there is another issue hiding in the transaction logic, we don't have any chance of finding it using this approach.
1667df0
to
27fc1a0
Compare
Signed-off-by: Siyuan Zhang <sizhang@google.com>
Signed-off-by: Siyuan Zhang <sizhang@google.com>
…bucket. Signed-off-by: Siyuan Zhang <sizhang@google.com>
27fc1a0
to
0a54362
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thanks for adding benchmark test suites.
Reran benchstat twice with count=10,
run 2:
|
So in the updated benchmark results you can see that it causes ~2% regression in performance. Still I think impact on overall etcd performance should be not noticable and we should pick correctness here. |
Recommit #17228 with buf fix.
#17247 was fixed by reverting #17228. But the underlying problem with the buffer should still be fixed.
No failure locally with
and just the
[RaftAfterSaveSnapPanic, MemberReplace]
Failpoints.Please read https://github.com/etcd-io/etcd/blob/main/CONTRIBUTING.md#contribution-flow.